Bioinformatics A Practical Guide to Next Generation Sequencing Data Analysis (Hamid D. Ismail)


Fitting the count data to a negative binomial model returns an object with several components, as shown in Figure 5.20. For instance, the GLM coefficients are in the “coefficients” slot, the fitted values are in the “fitted.values” slot, and the negative binomial dispersions used in the fit are in the “dispersion” slot, while the quasi-likelihood dispersions are stored in separate components. The quasi-likelihood approach extends the negative binomial model with a gene-specific dispersion that accounts for variability from both biological and technical sources. We can visualize the quasi-likelihood dispersions with the “plotQLDisp” function. The gene-wise quasi-likelihood dispersion estimates are squeezed toward a consensus trend, which reduces the uncertainty of the estimates and improves testing power. The following script creates a quasi-likelihood dispersion plot showing the raw, squeezed, and trended dispersions (Figure 5.21):

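# Open a JPEG device for the plot
jpeg("qlDispplots.jpg")
# Fit the quasi-likelihood negative binomial GLM
fitq <- glmQLFit(yNorm, design)
# Plot the raw, squeezed, and trended quasi-likelihood dispersions
plotQLDisp(fitq, pch=16, cex=1.2)
# Close the device so the file is written
dev.off()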
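
To make the idea of squeezing more concrete, the following base R sketch may help. It is a toy illustration, not edgeR's internal code: the helper names (ql_variance, squeeze_ql) and all numbers are hypothetical. It assumes the usual quasi-likelihood variance form, var(y) = sigma2 * (mu + phi * mu^2), and the standard empirical Bayes moderation, in which each raw estimate is averaged with its trend value using the residual and prior degrees of freedom as weights.

```r
# Toy illustration only: helper names and numbers are made up for this sketch.

# Quasi-likelihood negative binomial variance:
#   mu     = fitted mean count
#   phi    = negative binomial dispersion
#   sigma2 = gene-wise quasi-likelihood dispersion
ql_variance <- function(mu, phi, sigma2) {
  sigma2 * (mu + phi * mu^2)
}

# Empirical Bayes squeezing: a weighted average of the raw estimate and the
# trend value, weighted by the residual and prior degrees of freedom.
squeeze_ql <- function(raw, trend, df_resid, df_prior) {
  (df_prior * trend + df_resid * raw) / (df_prior + df_resid)
}

raw   <- c(0.4, 1.0, 2.5)  # hypothetical raw QL dispersions for three genes
trend <- c(1.0, 1.0, 1.2)  # hypothetical trend values for the same genes

# Each squeezed value lies between the raw estimate and the trend value
squeezed <- squeeze_ql(raw, trend, df_resid = 4, df_prior = 6)
print(squeezed)

# Variance implied for the first gene at mean 100 and NB dispersion 0.05
print(ql_variance(mu = 100, phi = 0.05, sigma2 = squeezed[1]))
```

A larger prior degrees of freedom pulls the estimates more strongly toward the trend, which is why the squeezed points in a plot such as Figure 5.21 are less scattered than the raw ones.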

Once we have fitted the count data to a GLM log-linear model, we can conduct gene-wise statistical tests for a given coefficient (coef), or we can use “contrast” to

FIGURE 5.20 Quasi-likelihood negative binomial model slots.

FIGURE 5.21 Quasi-likelihood dispersions plot.